Multi-modal and Cross-Modal for Lecture Videos Retrieval
Internal identifier: 000063 (Main/Exploration); previous: 000062; next: 000064
Authors: Nhu-Van Nguyen [France]; Mickaël Coustaty [France]; Jean-Marc Ogier [France]
Abstract
This paper studies the problem of multi-modal and cross-modal lecture video retrieval using document analysis techniques. In this paper, a lecture video is represented by a set of subjects, where each subject is represented by a Bag of mixed words (visual words and textual words), each of them coming from speech recognition and OCR engines. Our work relies on two assumptions: 1) a video may contain multiple subjects, and 2) multiple modalities exist in the same lecture video document. We propose a combination of technologies drawn from document image analysis and text mining. Visual words and textual words in images of lecture slides are extracted based on text detection and graphics localization computed on the sequences captured with a camera. Assuming that a subject in the video is composed of a set of slides, lecture slides are clustered into groups representing the possible subjects by using the extracted mixed words. Multi-modal and cross-modal lecture video retrieval is realized by the Bag of Subjects model. We discuss the proposed indexing and retrieval approach for lecture videos and report a quantitative evaluation on lecture videos from our university. The results show that using Bag of Subjects for lecture video retrieval improves retrieval accuracy.
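The indexing and retrieval idea described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the greedy threshold clustering, the cosine similarity, the rule that a video scores as its best-matching subject, the `vw_` identifiers standing in for visual words, and the toy slide data.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bags of words stored as Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_slides(slides, threshold=0.3):
    """Group slides into subjects (hypothetical greedy clustering).

    Each slide is a list of mixed words. A slide joins the first subject
    whose bag is similar enough, otherwise it starts a new subject.
    """
    subjects = []
    for words in slides:
        bag = Counter(words)
        for subject in subjects:
            if cosine(bag, subject) >= threshold:
                subject.update(bag)  # Counter.update adds the counts
                break
        else:
            subjects.append(bag)
    return subjects


def rank_videos(query_words, index):
    """Rank videos by their best-matching subject (assumed scoring rule)."""
    query = Counter(query_words)
    scores = {
        video: max((cosine(query, s) for s in subjects), default=0.0)
        for video, subjects in index.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: textual words plus made-up visual-word identifiers ("vw_17").
videos = {
    "lecture_a": [
        ["fourier", "transform", "signal", "vw_17"],
        ["fourier", "series", "signal", "vw_17", "vw_42"],
        ["matrix", "eigenvalue", "vw_03"],
    ],
    "lecture_b": [
        ["matrix", "determinant", "vw_03"],
        ["matrix", "inverse", "vw_03", "vw_08"],
    ],
}

# Each video is indexed as a bag of subjects.
index = {video: cluster_slides(slides) for video, slides in videos.items()}
ranking = rank_videos(["fourier", "vw_17"], index)
print(ranking)
```

Because each subject bag in this sketch mixes textual and visual words, a query expressed in one modality can still match subjects that also carry words from the other, which is one simple way to picture the cross-modal case.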
DOI: 10.1109/ICPR.2014.461
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 000084
- to stream Hal, to step Curation: 000084
- to stream Hal, to step Checkpoint: 000024
- to stream Main, to step Merge: 000063
- to stream Main, to step Curation: 000063
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Multi-modal and Cross-Modal for Lecture Videos Retrieval</title>
<author><name sortKey="Nguyen, Nhu Van" sort="Nguyen, Nhu Van" uniqKey="Nguyen N" first="Nhu-Van" last="Nguyen">Nhu-Van Nguyen</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
<author><name sortKey="Coustaty, Mickael" sort="Coustaty, Mickael" uniqKey="Coustaty M" first="Mickaël" last="Coustaty">Mickaël Coustaty</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
<author><name sortKey="Ogier, Jean Marc" sort="Ogier, Jean Marc" uniqKey="Ogier J" first="Jean-Marc" last="Ogier">Jean-Marc Ogier</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01247962</idno>
<idno type="halId">hal-01247962</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-01247962</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-01247962</idno>
<idno type="doi">10.1109/ICPR.2014.461</idno>
<date when="2014-08-24">2014-08-24</date>
<idno type="wicri:Area/Hal/Corpus">000084</idno>
<idno type="wicri:Area/Hal/Curation">000084</idno>
<idno type="wicri:Area/Hal/Checkpoint">000024</idno>
<idno type="wicri:Area/Main/Merge">000063</idno>
<idno type="wicri:Area/Main/Curation">000063</idno>
<idno type="wicri:Area/Main/Exploration">000063</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Multi-modal and Cross-Modal for Lecture Videos Retrieval</title>
<author><name sortKey="Nguyen, Nhu Van" sort="Nguyen, Nhu Van" uniqKey="Nguyen N" first="Nhu-Van" last="Nguyen">Nhu-Van Nguyen</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
<author><name sortKey="Coustaty, Mickael" sort="Coustaty, Mickael" uniqKey="Coustaty M" first="Mickaël" last="Coustaty">Mickaël Coustaty</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
<author><name sortKey="Ogier, Jean Marc" sort="Ogier, Jean Marc" uniqKey="Ogier J" first="Jean-Marc" last="Ogier">Jean-Marc Ogier</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-40831" status="VALID"><orgName>Laboratoire Informatique, Image et Interaction</orgName>
<orgName type="acronym">L3I</orgName>
<desc><address><addrLine>Bâtiment Pascal Avenue Michel Crépeau F-17042 La Rochelle Cedex 1</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lr.fr/l3i</ref>
</desc>
<listRelation><relation name="EA2118" active="#struct-300311" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2118" active="#struct-300311" type="direct"><org type="institution" xml:id="struct-300311" status="VALID"><orgName>Université de La Rochelle</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">La Rochelle</settlement>
<region type="region" nuts="2">Poitou-Charentes</region>
</placeName>
<orgName type="university">Université de La Rochelle</orgName>
</affiliation>
</author>
</analytic>
<idno type="DOI">10.1109/ICPR.2014.461</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper studies the problem of multi-modal and cross-modal lecture video retrieval using document analysis techniques. In this paper, a lecture video is represented by a set of subjects, where each subject is represented by a Bag of mixed words (visual words and textual words), each of them coming from speech recognition and OCR engines. Our work relies on two assumptions: 1) a video may contain multiple subjects, and 2) multiple modalities exist in the same lecture video document. We propose a combination of technologies drawn from document image analysis and text mining. Visual words and textual words in images of lecture slides are extracted based on text detection and graphics localization computed on the sequences captured with a camera. Assuming that a subject in the video is composed of a set of slides, lecture slides are clustered into groups representing the possible subjects by using the extracted mixed words. Multi-modal and cross-modal lecture video retrieval is realized by the Bag of Subjects model. We discuss the proposed indexing and retrieval approach for lecture videos and report a quantitative evaluation on lecture videos from our university. The results show that using Bag of Subjects for lecture video retrieval improves retrieval accuracy.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Poitou-Charentes</li>
</region>
<settlement><li>La Rochelle</li>
</settlement>
<orgName><li>Université de La Rochelle</li>
</orgName>
</list>
<tree><country name="France"><region name="Poitou-Charentes"><name sortKey="Nguyen, Nhu Van" sort="Nguyen, Nhu Van" uniqKey="Nguyen N" first="Nhu-Van" last="Nguyen">Nhu-Van Nguyen</name>
</region>
<name sortKey="Coustaty, Mickael" sort="Coustaty, Mickael" uniqKey="Coustaty M" first="Mickaël" last="Coustaty">Mickaël Coustaty</name>
<name sortKey="Ogier, Jean Marc" sort="Ogier, Jean Marc" uniqKey="Ogier J" first="Jean-Marc" last="Ogier">Jean-Marc Ogier</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000063 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000063 | SxmlIndent | more
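Once a record has been extracted, its fields could also be read with a general-purpose XML library. The sketch below is only an illustration under stated assumptions: it uses Python's standard `xml.etree.ElementTree` on a shortened copy of the record shown on this page, and the `xmlns:` URIs are placeholders, since the page uses the `wicri:` and `hal:` prefixes without declaring them and a parser would reject undeclared prefixes.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record from this page. The xmlns:* URIs are
# placeholders (assumptions), added so that prefixed names such as
# wicri:level or hal:affiliation would parse if they were included.
RECORD = """<record xmlns:wicri="urn:example:wicri" xmlns:hal="urn:example:hal">
<TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Multi-modal and Cross-Modal for Lecture Videos Retrieval</title>
<author><name first="Nhu-Van" last="Nguyen">Nhu-Van Nguyen</name></author>
<author><name first="Mickaël" last="Coustaty">Mickaël Coustaty</name></author>
<author><name first="Jean-Marc" last="Ogier">Jean-Marc Ogier</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1109/ICPR.2014.461</idno></publicationStmt>
</fileDesc></teiHeader></TEI>
</record>"""

root = ET.fromstring(RECORD)

# Title and authors are read from the titleStmt block of the header.
title = root.findtext(".//titleStmt/title")
authors = [name.text for name in root.findall(".//titleStmt/author/name")]

# The DOI is the idno element whose type attribute is "doi".
doi = root.findtext(".//idno[@type='doi']")

print(title)
print("; ".join(authors))
print(doi)
```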
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-01247962 |texte= Multi-modal and Cross-Modal for Lecture Videos Retrieval }}
This area was generated with Dilib version V0.6.32.